Standards covered by or related to COMBINE activities
One of the major goals of COMBINE is to improve the interoperability of existing standards, and to foster or support fledgling efforts aimed at filling gaps or addressing new needs. Listed below are some of the major community standard representation formats covered by or related to COMBINE activities.
COMBINE standards
The following standardization activities are open community efforts. The standards are described in freely available specifications and associated tools (XML schemas, UML diagrams etc.). They are piloted by democratically elected editorial boards, sometimes assisted by scientific committees. Decent software support exists, including API implementations. Development is supported by central teams and/or funding sources. The formats try to avoid overlap and instead strive to interoperate via interconversion, cross-linking, use of common metadata layers etc.
A comprehensive list of specification documents is also available, following the COMBINE specification infrastructure.
BioPAX
BioPAX is a standard language that aims to enable integration, exchange and analysis of biological pathway data. It is expressed in OWL. The latest specification is BioPAX Level 3. BioPAX development is coordinated by an elected editorial board and a Scientific Advisory Board. BioPAX is supported by many pathway databases and processing tools. An API, Paxtools, is available to help implement support.
The Systems Biology Graphical Notation (SBGN) is a set of standard graphical languages for describing biological knowledge visually. It is currently made up of three languages describing Process Descriptions, Entity Relationships and Activity Flows. The latest specifications are SBGN PD Level 1 Version 2.0, SBGN ER Level 1 Version 2 and SBGN AF Level 1 Version 1.2. SBGN development is coordinated by an elected editorial board and a Scientific Committee. Several data resources and software tools claim support for SBGN. An API, libSBGN, is available to help implement support.
The Systems Biology Markup Language (SBML) is a computer-readable XML format for representing models of biological processes. SBML is suitable for, but not limited to, models using a process description approach. The latest stable specification is Level 3 Version 2 Core. SBML development is coordinated by an elected editorial board and a central developer team. Over 250 software systems known to support SBML can be found in the SBML software guide. APIs are available to help implement support: libSBML in C++ and JSBML in Java.
The Simulation Experiment Description Markup Language (SED-ML) is an XML-based format for encoding simulation experiments. SED-ML makes it possible to define the models to use, the experimental tasks to run and the results to produce. SED-ML can be used with models encoded in several languages, provided they are in XML. The latest stable specification is Level 1 Version 3. SED-ML development is coordinated by an elected editorial board. APIs are available to help implement support: libSedML in C#, libSEDML in C++ with SWIG bindings for Python, Java, Perl, R and Ruby, and jlibsedml in Java.
The CellML language is an XML markup language to store and exchange computer-based mathematical models. CellML is being developed by the Auckland Bioengineering Institute at the University of Auckland and affiliated research groups. The latest stable specification is Version 1.1. CellML development is coordinated by an elected editorial board. An API is available to help implement support: CellML API in C.
The Synthetic Biology Open Language Data (SBOL Data) is a language for the description and exchange of synthetic biological parts, devices and systems. The latest stable specification of SBOL Data is 2.2.0. SBOL Data is developed by the SBOL Developers Group. The development is coordinated by an editorial board and the SBOL Chair. SBOL Data is supported by many software tools. APIs are available to help implement support for this data standard.
The Synthetic Biology Open Language Visual (SBOL Visual) is an open-source graphical notation that uses schematic “glyphs” to specify genetic parts, devices, modules, and systems. The latest stable specification of SBOL Visual is 2.0.0. SBOL Visual is developed by the SBOL Developers Group and SBOL Visual Group. The development is coordinated by an editorial board and the SBOL Chair. SBOL Visual is supported by many software tools.
The NeuroML project focuses on the development of an XML-based description language that provides a common data format for defining and exchanging descriptions of neuronal cell and network models. The latest stable specification of NeuroML is version 2 beta 4. NeuroML development is coordinated by the NeuroML Editorial Board. NeuroML is supported by many software tools and databases.
Associated standardization efforts
The standardization efforts described below are not community-developed representation formats. However, they add a layer of semantics that facilitates the use and interoperability, or enhances the usefulness, of COMBINE representation formats.
COMBINE Archive
A COMBINE archive is a single file bundling the various documents necessary for a modelling and simulation project, and all relevant information. The archive is encoded using the Open Modeling EXchange format (OMEX).
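The bundling idea can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes (this is not stated above) that an OMEX file is a ZIP container with a `manifest.xml` at its root listing every entry and its format URI. The helper name `build_omex` and the example file name are hypothetical.

```python
import io
import zipfile
from xml.etree import ElementTree as ET

# Assumption: namespace and format URIs as I understand the OMEX manifest layout.
OMEX_NS = "http://identifiers.org/combine.specifications/omex-manifest"
OMEX_FORMAT = "http://identifiers.org/combine.specifications/omex"


def build_omex(files):
    """Bundle files into an in-memory archive.

    `files` maps an archive path to a (content bytes, format URI) pair.
    Returns the bytes of the resulting ZIP container.
    """
    ET.register_namespace("", OMEX_NS)
    root = ET.Element(f"{{{OMEX_NS}}}omexManifest")

    # The archive itself and the manifest are listed next to the payload files.
    entries = [(".", OMEX_FORMAT), ("./manifest.xml", OMEX_NS)]
    entries += [(f"./{path}", fmt) for path, (_, fmt) in files.items()]
    for location, fmt in entries:
        ET.SubElement(root, f"{{{OMEX_NS}}}content", location=location, format=fmt)
    manifest = ET.tostring(root, encoding="utf-8", xml_declaration=True)

    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("manifest.xml", manifest)
        for path, (content, _) in files.items():
            archive.writestr(path, content)
    return buffer.getvalue()


if __name__ == "__main__":
    data = build_omex({
        "model.xml": (b"<sbml/>", "http://identifiers.org/combine.specifications/sbml"),
    })
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        print(archive.namelist())
```

A real project would more likely rely on an existing COMBINE archive library than on hand-written manifest code; the sketch only shows why a single file is enough to carry models, simulation descriptions and their format declarations together.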
COMBINE Archive Metadata
COMBINE archive metadata provides a harmonized, community-driven approach for annotating a variety of standardized model and data representation formats within a COMBINE archive.
Identifiers.org URIs
MIRIAM Unique Resource Identifiers allow one to uniquely and unambiguously identify an entity in a stable and perennial manner. MIRIAM Registry is a set of services and resources that provide support for generating, interpreting and resolving MIRIAM URIs. Through the Identifiers.org technology, MIRIAM URIs can be dereferenced in a flexible and robust way.
MIRIAM URIs are used by SBML, SED-ML, CellML and BioPAX controlled annotation schemes.
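As a rough illustration of how such identifiers are assembled and taken apart, the sketch below builds a URI from a collection prefix and an accession, and parses it back. It assumes the compact `prefix:accession` URL form and the older `prefix/accession` form; the function names are hypothetical, and collections whose accessions embed their own prefix are deliberately not handled.

```python
import re
from urllib.parse import urlparse


def make_uri(prefix, accession):
    """Build a resolvable URI from a collection prefix and an accession."""
    return f"https://identifiers.org/{prefix}:{accession}"


def parse_uri(uri):
    """Split a URI into (prefix, accession).

    Accepts both 'https://identifiers.org/prefix:accession' and the older
    'http://identifiers.org/prefix/accession' (assumed forms).
    """
    parsed = urlparse(uri)
    if parsed.netloc != "identifiers.org":
        raise ValueError(f"not an Identifiers.org URI: {uri!r}")
    # Split on the first ':' or '/' after the host.
    parts = re.split(r"[:/]", parsed.path.lstrip("/"), maxsplit=1)
    if len(parts) != 2 or not all(parts):
        raise ValueError(f"cannot split prefix and accession in {uri!r}")
    return parts[0], parts[1]


if __name__ == "__main__":
    uri = make_uri("uniprot", "P12345")
    print(uri)
    print(parse_uri(uri))
    print(parse_uri("http://identifiers.org/taxonomy/9606"))
```

Keeping the prefix and accession separate inside a tool, and only joining them when writing an annotation, makes it easier to follow changes in the URI scheme.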
Systems Biology Ontology
The Systems Biology Ontology (SBO) is a set of controlled, relational vocabularies of terms commonly used in Systems Biology, and in particular in computational modeling.
Each element of an SBML file carries an optional attribute sboTerm whose value must be a term from SBO.
Each symbol of SBGN is associated with an SBO term.
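To make the sboTerm attribute concrete, here is a small sketch that reads an SBML fragment with the Python standard library and collects the SBO terms attached to its elements. The fragment is illustrative and far from a complete, valid model; the `SBO:` plus seven digits pattern is my understanding of the identifier syntax, and the specific term numbers are examples only.

```python
import re
from xml.etree import ElementTree as ET

# Illustrative fragment only; a real SBML model needs more attributes and elements.
SBML = """<sbml xmlns="http://www.sbml.org/sbml/level3/version2/core" level="3" version="2">
  <model id="toy">
    <listOfSpecies>
      <species id="S1" compartment="c" sboTerm="SBO:0000247"/>
      <species id="S2" compartment="c"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="R1" reversible="false" sboTerm="SBO:0000176"/>
    </listOfReactions>
  </model>
</sbml>"""

# Assumed identifier syntax: 'SBO:' followed by seven digits.
SBO_PATTERN = re.compile(r"SBO:\d{7}")


def collect_sbo_terms(sbml_text):
    """Return {element id: sboTerm} for every element that carries the attribute."""
    root = ET.fromstring(sbml_text)
    terms = {}
    for element in root.iter():
        term = element.get("sboTerm")
        if term is None:
            continue  # the attribute is optional
        if not SBO_PATTERN.fullmatch(term):
            raise ValueError(f"malformed sboTerm {term!r} on {element.get('id')!r}")
        terms[element.get("id")] = term
    return terms


if __name__ == "__main__":
    print(collect_sbo_terms(SBML))
```

Production code would normally go through libSBML or JSBML rather than raw XML parsing; the point here is only that the annotation is a plain attribute that any tool can read.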
Kinetic Simulation Algorithm Ontology
The Kinetic Simulation Algorithm Ontology (KiSAO) describes existing algorithms and their inter-relationships through their characteristics and parameters.
KiSAO is used in SED-ML, which allows simulation software to automatically choose the best algorithm available to perform a simulation and unambiguously refer to it.
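The way a simulation tool might act on such a reference can be sketched as follows. The SED-ML fragment is illustrative, and the `AVAILABLE_SOLVERS` registry with its solver names is entirely hypothetical. The sketch only performs an exact lookup; a real tool could also use the relationships recorded in KiSAO to fall back on a similar algorithm when the requested one is not available.

```python
from xml.etree import ElementTree as ET

# Illustrative fragment; namespace and attribute names are assumptions
# based on my reading of SED-ML Level 1 Version 3.
SEDML = """<sedML xmlns="http://sed-ml.org/sed-ml/level1/version3" level="1" version="3">
  <listOfSimulations>
    <uniformTimeCourse id="sim1" initialTime="0" outputStartTime="0"
                       outputEndTime="10" numberOfPoints="100">
      <algorithm kisaoID="KISAO:0000019"/>
    </uniformTimeCourse>
  </listOfSimulations>
</sedML>"""

# Hypothetical registry: KiSAO identifier -> name of a locally installed solver.
AVAILABLE_SOLVERS = {
    "KISAO:0000019": "cvode_like_solver",
    "KISAO:0000032": "rk4_like_solver",
}


def local_name(tag):
    """Strip the '{namespace}' part from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def requested_algorithms(sedml_text):
    """Return {simulation id: kisaoID} for every simulation with an algorithm child."""
    root = ET.fromstring(sedml_text)
    requests = {}
    for simulation in root.iter():
        for child in simulation:
            if local_name(child.tag) == "algorithm":
                requests[simulation.get("id")] = child.get("kisaoID")
    return requests


def pick_solver(kisao_id):
    """Exact lookup of a solver for the requested KiSAO term."""
    try:
        return AVAILABLE_SOLVERS[kisao_id]
    except KeyError:
        raise LookupError(f"no local solver registered for {kisao_id}") from None


if __name__ == "__main__":
    for sim_id, kisao_id in requested_algorithms(SEDML).items():
        print(sim_id, kisao_id, pick_solver(kisao_id))
```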
BioModels.net qualifiers
BioModels.net qualifiers are standardized relationships (predicates) that specify the relation between an object represented in a description language and the external resource used to annotate it. The relationship is rarely one-to-one, and the information content of an annotation is greatly increased if one knows what it represents, rather than only knowing that it is "related to" the model component.
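The sketch below shows how a qualifier turns a bare cross-reference into a statement with a defined meaning. It parses an RDF annotation of the kind embedded in model files and lists (subject, qualifier, resource) triples. The annotation content is illustrative, the namespace URIs are given to the best of my knowledge, and the function name is hypothetical.

```python
from xml.etree import ElementTree as ET

# Illustrative annotation: one model component with two qualified cross-references.
ANNOTATION = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:bqbiol="http://biomodels.net/biology-qualifiers/"
         xmlns:bqmodel="http://biomodels.net/model-qualifiers/">
  <rdf:Description rdf:about="#S1">
    <bqbiol:is>
      <rdf:Bag>
        <rdf:li rdf:resource="https://identifiers.org/chebi/CHEBI:17234"/>
      </rdf:Bag>
    </bqbiol:is>
    <bqbiol:hasTaxon>
      <rdf:Bag>
        <rdf:li rdf:resource="https://identifiers.org/taxonomy/9606"/>
      </rdf:Bag>
    </bqbiol:hasTaxon>
  </rdf:Description>
</rdf:RDF>"""

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
QUALIFIER_NAMESPACES = {
    "http://biomodels.net/biology-qualifiers/": "bqbiol",
    "http://biomodels.net/model-qualifiers/": "bqmodel",
}


def extract_annotations(rdf_text):
    """Return a list of (subject, qualifier, resource URI) triples."""
    root = ET.fromstring(rdf_text)
    triples = []
    for description in root.iter(f"{{{RDF}}}Description"):
        subject = description.get(f"{{{RDF}}}about")
        for predicate in description:
            # ElementTree tags look like '{namespace}name'.
            namespace, _, name = predicate.tag[1:].partition("}")
            prefix = QUALIFIER_NAMESPACES.get(namespace)
            if prefix is None:
                continue  # not a BioModels.net qualifier
            for item in predicate.iter(f"{{{RDF}}}li"):
                resource = item.get(f"{{{RDF}}}resource")
                triples.append((subject, f"{prefix}:{name}", resource))
    return triples


if __name__ == "__main__":
    for triple in extract_annotations(ANNOTATION):
        print(triple)
```

Without the qualifier, both resources would only be "related to" the component; with it, a tool can tell an identity statement from a taxonomic context.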
Related standardization efforts
The following standardization efforts are of interest for COMBINE, either as candidate standards, or similar efforts in different domains.
Computational Neuroscience Ontology
The Computational Neuroscience Ontology (CNO) is a controlled vocabulary composed of classes representing general concepts related to computational neuroscience.
FieldML
The goal of FieldML (Field Modelling/Markup Language) is to be a declarative language for building hierarchical models represented by generalized mathematical fields. Its primary use will be to represent the dynamic geometry and solution fields from computational models of cells, tissues and organs.
FROG
FROG analysis is a community standard to foster reproducibility and curation of constraint-based models. FROG provides guidelines, best practices, and a set of standardized FBA analyses to assess reproducibility and curation efforts.
FSK-ML
FSK-ML (Food Safety Knowledge Markup Language) aims to encode experimental data and mathematical models from the domain of predictive microbial modelling (and beyond) in a software-independent manner. An FSK-ML file is an OMEX container, like the COMBINE Archive, and reuses other COMBINE standards to store models and simulations. A former version of the standard was the PMF-ML, and OpenML for Predictive Modelling in Food.
GPML
GPML (GenMAPP Pathway Markup Language) is an XML-based format to define a pathway consisting of purely graphical elements (such as lines and shapes) or graphical elements with added biological information (such as genes, proteins and datanodes).
MAMO
The mathematical modelling ontology (MAMO) is an ontology describing and classifying the mathematical models used in the life sciences (for the time being). MAMO provides the types of models, the variables they use, the readout to expect and other relevant features.
NineML
The Network Interchange for Neuroscience Modeling Language (NineML) is a language developed by the International Neuroinformatics Coordinating Facility (INCF) and designed for the description of large networks of spiking neurons.
NuML
The Numerical Markup Language (NuML) (pronounced "neumeul" and not "new em el", which sounds like NewML) is a simple XML format to exchange multidimensional arrays of numbers to be used with model and simulation descriptions. NuML was initially developed as part of the Systems Biology Results Markup Language (SBRML).
PEtab is an SBML- and TSV-based data format for parameter estimation problems in systems biology.
The Pharmacometrics Markup Language (PharmML) is an exchange format for encoding models, associated tasks and their annotation as used in pharmacometrics. PharmML is developed by the DDMoRe consortium, a European Innovative Medicines Initiative (IMI) project.
PSI-MI
The Proteomics Standards Initiative Molecular Interaction XML Format is a data exchange format for molecular interactions developed by the HUPO Proteomics Standards Initiative.
The Spiking Neural Mark-up Language (SpineML) is a declarative XML-based model description language for large scale neural network models. It is partially based on the INCF NineML.
The Terminology for the Description of Dynamics is a project to build an ontology for dynamical behaviours, observable dynamical phenomena, and control elements of bio-models and biological systems in Systems Biology and Synthetic Biology.
More information about standards used to share data in life sciences can be found at FAIRsharing. The FAIRsharing team manually curates data and metadata standards, databases and data policies across all scientific research areas. FAIRsharing provides links among these standards and databases as well as to journal and funder data policies that recommend or endorse their use. For example, FAIRsharing contains a collection of the COMBINE standards as well as many Systems Biology resources.